Graph Neural Networks can Recover the Hidden Features Solely from the Graph Structure
Graph Neural Networks (GNNs) are popular models for graph learning problems.
GNNs show strong empirical performance in many practical tasks. However, the
theoretical properties have not been completely elucidated. In this paper, we
investigate whether GNNs can exploit the graph structure from the perspective
of the expressive power of GNNs. In our analysis, we consider graph generation
processes that are controlled by hidden node features, which contain all
information about the graph structure. A typical example of this framework is
kNN graphs constructed from the hidden features. In our main results, we show
that GNNs can recover the hidden node features from the input graph alone, even
when all node features, including the hidden features themselves and any
indirect hints, are unavailable. GNNs can further use the recovered node
features for downstream tasks. These results show that GNNs can fully exploit
the graph structure by themselves, and in effect, GNNs can use both the hidden
and explicit node features for downstream tasks. In the experiments, we confirm
the validity of our results by showing that GNNs can accurately recover the
hidden features using a GNN architecture built on our theoretical
analysis.
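The abstract names kNN graphs built from hidden features as a typical instance of its framework. The following is a minimal sketch of that construction only, not of the paper's recovery method. The function name `knn_graph`, the Euclidean metric, and the toy 2-D features are assumptions chosen for illustration.

```python
import math


def knn_graph(features, k):
    """Return directed edges (i, j) where j is one of the k nearest points to i."""
    n = len(features)
    edges = set()
    for i in range(n):
        # Rank every other node by Euclidean distance to node i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )
        for j in others[:k]:
            edges.add((i, j))
    return edges


# Hypothetical hidden features: two well-separated clusters in the plane.
hidden = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
edges = knn_graph(hidden, k=2)
```

In this toy case every edge stays inside one cluster, so the edge set alone already reveals which nodes have similar hidden features. The paper's setting, as the abstract describes it, hands a GNN only such an edge set and asks it to recover the features.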
Poincare: Recommending Publication Venues via Treatment Effect Estimation
Choosing a publication venue for an academic paper is a crucial step in the
research process. However, in many cases, decisions are based solely on the
experience of researchers, which often leads to suboptimal results. Although
there exist venue recommender systems for academic papers, they recommend
venues where the paper is expected to be published. In this study, we aim to
recommend publication venues from a different perspective. We estimate the
number of citations a paper will receive if the paper is published in each
venue and recommend the venue where the paper has the most potential impact.
However, there are two challenges to this task. First, a paper is published in
only one venue, and thus, we cannot observe the number of citations the paper
would receive if the paper were published in another venue. Second, the
contents of a paper and the publication venue are not statistically
independent; that is, there exist selection biases in choosing publication
venues. In this paper, we formulate the venue recommendation problem as a
treatment effect estimation problem. We use a bias correction method to
estimate the potential impact of choosing a publication venue effectively and
to recommend venues based on the potential impact of papers in each venue. We
highlight the effectiveness of our method using paper data from computer
science conferences.
Comment: Journal of Informetrics
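The abstract does not name the bias correction method it uses. As one well-known example of correcting selection bias in treatment effect estimation, the sketch below applies self-normalized inverse propensity weighting to toy data. The function name, the record layout, the propensity values, and the citation counts are all hypothetical and are not taken from the paper.

```python
def ipw_mean_outcome(records, venue):
    """Estimate the average citation count if every paper went to `venue`.

    Each record is (chosen_venue, propensity_of_chosen_venue, citations).
    Papers observed at `venue` are reweighted by 1 / propensity, and the
    weights are normalized to sum to one (self-normalized estimator).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for chosen, propensity, citations in records:
        if chosen == venue:
            weight = 1.0 / propensity
            weighted_sum += weight * citations
            weight_total += weight
    return weighted_sum / weight_total


# Toy data (assumed): strong papers pick venue A with probability 0.75,
# weak papers pick it with probability 0.25, so venue choice is biased.
records = (
    [("A", 0.75, 20)] * 3 + [("B", 0.25, 18)]    # four strong papers
    + [("A", 0.25, 6)] + [("B", 0.75, 4)] * 3    # four weak papers
)

naive_a = sum(c for v, _, c in records if v == "A") / 4  # biased upward
ipw_a = ipw_mean_outcome(records, "A")
ipw_b = ipw_mean_outcome(records, "B")
```

Here the naive average for venue A is inflated because strong papers are over-represented there, while the reweighted estimates land close to the population averages of 13 and 11 citations implied by the toy numbers.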
Neuregulin 1 Type II-ErbB Signaling Promotes Cell Divisions Generating Neurons from Neural Progenitor Cells in the Developing Zebrafish Brain.
Post-mitotic neurons are generated from neural progenitor cells (NPCs) at the expense of their proliferation. Molecular and cellular mechanisms that regulate neuron production temporally and spatially should impact on the size and shape of the brain. While transcription factors such as neurogenin1 (neurog1) and neurod govern progression of neurogenesis as cell-intrinsic mechanisms, recent studies show regulatory roles of several cell-extrinsic or intercellular signaling molecules including Notch, FGF and Wnt in production of neurons/neural progenitor cells from neural stem cells/radial glial cells (NSCs/RGCs) in the ventricular zone (VZ). However, it remains elusive how production of post-mitotic neurons from neural progenitor cells is regulated in the sub-ventricular zone (SVZ). Here we show that newborn neurons accumulate in the basal-to-apical direction in the optic tectum (OT) of zebrafish embryos. While neural progenitor cells are amplified by mitoses in the apical ventricular zone, neurons are exclusively produced through mitoses of neural progenitor cells in the sub-basal zone, later in the sub-ventricular zone, and accumulate apically onto older neurons. This neurogenesis depends on Neuregulin 1 type II (NRG1-II)-ErbB signaling. Treatment with an ErbB inhibitor, AG1478, impairs mitoses in the sub-ventricular zone of the optic tectum. Removal of AG1478 resumes sub-ventricular mitoses without preceding mitoses in the apical ventricular zone prior to basal-to-apical accumulation of neurons, suggesting critical roles of ErbB signaling in mitoses for post-mitotic neuron production. Knockdown of NRG1-II impairs mitoses in both the sub-basal/sub-ventricular zone and the ventricular zone. Injection of soluble human NRG1 into the developing brain ameliorates neurogenesis of NRG1-II-knockdown embryos, suggesting a conserved role of NRG1 as a cell-extrinsic signal.
From these results, we propose that NRG1-ErbB signaling stimulates cell divisions generating neurons from neural progenitor cells in the developing vertebrate brain.
Embarrassingly Simple Text Watermarks
We propose Easymark, a family of embarrassingly simple yet effective
watermarks. Text watermarking is becoming increasingly important with the
advent of Large Language Models (LLMs). LLMs can generate texts that cannot be
distinguished from human-written texts. This is a serious problem for the
credibility of the text. Easymark is a simple yet effective solution to this
problem. Easymark can inject a watermark without changing the meaning of the
text at all while a validator can detect if a text was generated from a system
that adopted Easymark or not with high credibility. Easymark is extremely easy
to implement, requiring only a few lines of code. Easymark does not
require access to LLMs, so it can be implemented on the user-side when the LLM
providers do not offer watermarked LLMs. In spite of its simplicity, it
achieves higher detection accuracy and BLEU scores than the state-of-the-art
text watermarking methods. We also prove an impossibility theorem for perfect
watermarking, which is valuable in its own right. This theorem shows that no
matter how sophisticated a watermark is, a malicious user could remove it from
the text, which motivates us to use a simple watermark such as Easymark. We
carry out experiments with LLM-generated texts and confirm that Easymark can be
detected reliably without any degradation of BLEU and perplexity, and that it
outperforms state-of-the-art watermarks in terms of both quality and
reliability.
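The abstract does not spell out how the watermark is injected. To illustrate how a watermark could leave the wording untouched and need no access to the LLM, the sketch below assumes a whitespace-substitution scheme: each ordinary space (U+0020) is swapped for a visually similar Unicode space (U+2004). The function names and the choice of code point are assumptions for illustration and should not be read as the paper's confirmed design.

```python
PLAIN_SPACE = "\u0020"   # ordinary space
MARK_SPACE = "\u2004"    # three-per-em space, assumed here as the mark


def add_watermark(text):
    """Replace every ordinary space with the marking space."""
    return text.replace(PLAIN_SPACE, MARK_SPACE)


def has_watermark(text):
    """Report whether the marking space occurs anywhere in the text."""
    return MARK_SPACE in text


original = "LLMs can generate fluent text"
marked = add_watermark(original)
```

The words themselves are unchanged, so word-level metrics computed after whitespace tokenization see the same tokens. The sketch also shows why such a mark is easy to strip: normalizing whitespace removes it, which is consistent with the abstract's point that a determined user can remove any watermark.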
Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data
SGD with momentum acceleration is one of the key components for improving the
performance of neural networks. For decentralized learning, a straightforward
approach using momentum acceleration is Distributed SGD (DSGD) with momentum
acceleration (DSGDm). However, DSGDm performs worse than DSGD when the data
distributions are statistically heterogeneous. Recently, several studies have
addressed this issue and proposed methods with momentum acceleration that are
more robust to data heterogeneity than DSGDm, although their convergence rates
remain dependent on data heterogeneity and decrease when the data distributions
are heterogeneous. In this study, we propose Momentum Tracking, which is a
method with momentum acceleration whose convergence rate is proven to be
independent of data heterogeneity. More specifically, we analyze the
convergence rate of Momentum Tracking in the standard deep learning setting,
where the objective function is non-convex and the stochastic gradient is used.
Then, we identify that it is independent of data heterogeneity for any momentum
coefficient. Through image classification tasks, we
demonstrate that Momentum Tracking is more robust to data heterogeneity than
the existing decentralized learning methods with momentum acceleration and can
consistently outperform these existing methods when the data distributions are
heterogeneous.
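As a rough illustration of the DSGDm baseline the abstract mentions, the sketch below runs one common formulation on a toy problem: each node updates a local momentum buffer, takes a local step, and then averages with its neighbors through a mixing matrix. It is not Momentum Tracking. The scalar quadratic objectives, the full (non-stochastic) gradients, the mixing matrix, and all hyperparameter values are assumptions made to keep the example small.

```python
def dsgdm(targets, mixing, lr=0.1, beta=0.9, steps=500):
    """Decentralized gradient descent with local momentum on scalar quadratics.

    Node i minimizes f_i(x) = 0.5 * (x - targets[i]) ** 2. Different targets
    per node stand in for heterogeneous data distributions.
    """
    n = len(targets)
    x = [0.0] * n  # one scalar parameter per node
    u = [0.0] * n  # one momentum buffer per node
    for _ in range(steps):
        # Local gradient at each node's own parameter.
        grads = [x[i] - targets[i] for i in range(n)]
        # Local momentum update.
        u = [beta * u[i] + grads[i] for i in range(n)]
        # Local descent step using the momentum buffer.
        stepped = [x[i] - lr * u[i] for i in range(n)]
        # Gossip averaging with neighbors via the mixing matrix.
        x = [sum(mixing[i][j] * stepped[j] for j in range(n)) for i in range(n)]
    return x


# Three fully connected nodes with a symmetric, doubly stochastic mixing matrix.
targets = [-3.0, 0.0, 6.0]
mixing = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
params = dsgdm(targets, mixing)
average = sum(params) / len(params)
```

In this toy run the average of the node parameters approaches the global minimizer, the mean of the targets, but the individual nodes settle at different values because their local objectives disagree. That residual disagreement is a small-scale picture of the data heterogeneity effect the abstract is concerned with.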